The file sat.txt contains multi-spectral pixel values from a satellite image, used to predict the soil class. Analyse the data with several methods and compare them via cross-validation. Then apply the models to the test sample sat.tst and discuss the results obtained.
The database contains the multi-spectral values of pixels in 3x3 neighbourhoods of a satellite image, together with the classification associated with the central pixel of each neighbourhood. The goal is to predict this classification from the multi-spectral values.
The database was generated from the LANDSAT multi-spectral scanner (MSS). A Landsat MSS frame consists of four digital images of the same scene in different spectral bands. Two of them lie in the visible region (corresponding roughly to the green and red parts of the visible spectrum) and two in the infrared. Each pixel is an 8-bit binary word, with 0 corresponding to black and 255 to white. The spatial resolution of a pixel is about 80 m x 80 m, and each image contains 2340 x 3380 such pixels.
The database is a (small) sub-area of one scene, consisting of 82 x 100 pixels. Each line of data corresponds to a 3x3 square pixel neighbourhood completely contained within the 82x100 sub-area. Each line holds the pixel values in the four spectral bands (converted to ASCII) for each of the 9 pixels in the 3x3 neighbourhood, plus a number encoding the classification label of the central pixel.
The class of each pixel is encoded as a number:
1: red soil
2: cotton crop
3: grey soil
4: damp grey soil
5: soil with vegetation stubble
6: mixture of soil types
7: very damp grey soil
In each line of data, the four spectral values for the top-left pixel come first, followed by the four values for the top-middle pixel and then those for the top-right pixel, and so on, so that the pixels are read left to right and top to bottom. The four spectral values for the central pixel are therefore attributes 17, 18, 19 and 20. If desired, only these four attributes may be used, ignoring the rest.
Since the statement allows using only the spectral values of the central pixel, we extract them, together with their classification, into a single data frame.
# Packages used throughout the analysis
library(dplyr)
library(ggplot2)
library(rpart)
library(rpart.plot)
library(randomForest)
library(caret)
library(klaR)   # stepclass
library(nnet)   # multinom

file_trn <- "SAT_trn.txt"
data_trn <- read.table(file_trn, sep = "", header = FALSE)
sat_trn_df <- data_trn %>%
  dplyr::select(17:20, 37) %>%
  rename(B1 = V17,
         B2 = V18,
         B3 = V19,
         B4 = V20,
         C = V37)
file_tst <- "SAT_tst.txt"
data_tst <- read.table(file_tst, sep = "", header = FALSE)
sat_tst_df <- data_tst %>%
  dplyr::select(17:20, 37) %>%
  rename(B1 = V17,
         B2 = V18,
         B3 = V19,
         B4 = V20,
         C = V37)
# Relabel class 7 as 6 so the class codes stay contiguous
# (class 6, the mixture class, appears to have no examples in this dataset)
sat_trn_df$C[sat_trn_df$C == 7] <- 6
sat_tst_df$C[sat_tst_df$C == 7] <- 6
# Keep copies with the numeric class codes before mapping them to names
sat_trn_df_numeric <- data.frame(sat_trn_df)
sat_tst_df_numeric <- data.frame(sat_tst_df)
class_names <- c("rojo", "algodón", "gris","gris húmedo", "vegetación", "gris muy húmedo")
class_values <- c(1,2,3,4,5,6)
class_map <- setNames(class_names, class_values)
# Map codes to names and convert to factor (required by randomForest and caret)
sat_trn_df$C <- factor(class_map[sat_trn_df$C], levels = class_names)
sat_tst_df$C <- factor(class_map[sat_tst_df$C], levels = class_names)
| B1 | B2 | B3 | B4 | C |
|---|---|---|---|---|
| 92 | 112 | 118 | 85 | gris |
| 84 | 103 | 104 | 81 | gris |
| 84 | 99 | 104 | 78 | gris |
| 84 | 99 | 104 | 81 | gris |
| 76 | 99 | 104 | 81 | gris |
| 76 | 99 | 108 | 85 | gris |
| 80 | 112 | 118 | 88 | gris |
| 80 | 107 | 113 | 85 | gris |
| 76 | 91 | 104 | 74 | gris húmedo |
| 76 | 95 | 100 | 78 | gris húmedo |
We can check that each row holds four spectral-band values bounded between 0 and 255 plus an assigned soil category. The contents are easier to inspect with a parallel-coordinates plot.
In such a plot each line represents one observation across the 4 bands. Normalising the values is unnecessary, since \(B_1:B_4\) are on the same scale. We can also see that the soil-class distribution is practically the same in the training and test data.
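The plotting code itself is not included in the source. One way to produce such a plot, using `MASS::parcoord` on the numeric copies kept earlier (a sketch, not the author's original code):

```r
library(MASS)   # parcoord

# One line per observation across B1..B4, coloured by numeric class code;
# side-by-side panels allow comparing training vs. test distributions
old_par <- par(mfrow = c(1, 2))
parcoord(sat_trn_df_numeric[, c("B1", "B2", "B3", "B4")],
         col = sat_trn_df_numeric$C, main = "Training")
parcoord(sat_tst_df_numeric[, c("B1", "B2", "B3", "B4")],
         col = sat_tst_df_numeric$C, main = "Test")
par(old_par)
```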
set.seed(20)
n <- nrow(sat_trn_df)
# Fit the classification tree with 10-fold CV (cp = -1 grows the full tree)
sat_rpart <- rpart(C ~ B1 + B2 + B3 + B4, data= sat_trn_df, method = "class", cp=-1,xval=10)
sat_rpart_summary <- printcp(sat_rpart)
##
## Classification tree:
## rpart(formula = C ~ B1 + B2 + B3 + B4, data = sat_trn_df, method = "class",
## cp = -1, xval = 10)
##
## Variables actually used in tree construction:
## [1] B1 B2 B3 B4
##
## Root node error: 3363/4435 = 0.75829
##
## n= 4435
##
## CP nsplit rel error xerror xstd
## 1 2.6167e-01 0 1.00000 1.00000 0.0084779
## 2 2.5335e-01 1 0.73833 0.76896 0.0097636
## 3 1.2905e-01 2 0.48498 0.48498 0.0095487
## 4 6.3634e-02 3 0.35593 0.35742 0.0088020
## 5 1.7544e-02 4 0.29230 0.29379 0.0082400
## 6 1.4570e-02 5 0.27475 0.28189 0.0081181
## 7 6.6905e-03 7 0.24561 0.24234 0.0076694
## 8 5.9471e-03 10 0.22034 0.23253 0.0075467
## 9 5.6497e-03 11 0.21439 0.22629 0.0074660
## 10 4.7577e-03 13 0.20309 0.21885 0.0073673
## 11 4.6090e-03 14 0.19833 0.20934 0.0072364
## 12 3.2709e-03 16 0.18912 0.20636 0.0071945
## 13 1.6354e-03 17 0.18585 0.19774 0.0070698
## 14 1.3381e-03 19 0.18258 0.19625 0.0070479
## 15 1.1894e-03 21 0.17990 0.19536 0.0070346
## 16 1.0903e-03 22 0.17871 0.19477 0.0070258
## 17 8.9206e-04 25 0.17544 0.19715 0.0070610
## 18 6.9382e-04 31 0.17009 0.19715 0.0070610
## 19 5.9471e-04 34 0.16800 0.19715 0.0070610
## 20 4.9559e-04 48 0.15968 0.19685 0.0070567
## 21 4.4603e-04 51 0.15819 0.19387 0.0070124
## 22 3.9647e-04 55 0.15641 0.19328 0.0070035
## 23 2.9735e-04 58 0.15522 0.19328 0.0070035
## 24 1.4868e-04 61 0.15433 0.20012 0.0071047
## 25 8.4958e-05 63 0.15403 0.20101 0.0071176
## 26 7.4338e-05 70 0.15343 0.20071 0.0071133
## 27 0.0000e+00 74 0.15314 0.19952 0.0070960
## 28 -1.0000e+00 197 0.15314 0.19952 0.0070960
plotcp(sat_rpart)
which.min(sat_rpart$cptable[, "xerror"])
## 22
## 22
cp<-sat_rpart$cptable[which.min(sat_rpart$cptable[,"xerror"]),"CP"]
cp
## [1] 0.0003964714
# xerror is relative to the root node error, so rescale to an absolute error rate
CVErr_rpart <- sat_rpart$frame[1, "dev"] / n *
  sat_rpart$cptable[which.min(sat_rpart$cptable[, "xerror"]), "xerror"]
CVErr_rpart
## [1] 0.1465614
poda_rpart<-prune(sat_rpart,cp=cp)
rpart.plot(poda_rpart, cex = 0.5, extra = 0)
set.seed(20)
sat_rf<- randomForest(C~., data= sat_trn_df,
mtry=sqrt(ncol(sat_trn_df)-1), importance=TRUE)
importance(sat_rf)
## 1 2 3 4 5 6
## B1 227.87903 29.96135 362.16076 58.21353 50.21856 48.37936
## B2 140.86987 43.00905 53.51638 37.62347 112.03571 41.93234
## B3 30.04423 10.40657 24.77468 15.11574 28.56435 32.31841
## B4 57.76868 24.80617 34.26451 35.82601 52.16778 74.12139
## MeanDecreaseAccuracy MeanDecreaseGini
## B1 186.14425 1186.7218
## B2 119.11363 914.1809
## B3 48.88156 425.2869
## B4 91.94379 853.1770
varImpPlot(sat_rf)
# The last row of err.rate holds the OOB error, used here in place of CV
CVErr_rf <- unname(sat_rf$err.rate[nrow(sat_rf$err.rate), 1])
CVErr_rf
## [1] 0.1492672
set.seed(20)
sat_knn <- train(C ~ ., data = sat_trn_df,
method = "knn",
preProcess = c("center","scale"),
trControl = trainControl(method = "cv", number = 10),
tuneGrid = expand.grid(k = 1:100),
metric = "Accuracy")
results <- sat_knn$results
ggplot(results, aes(x = k, y = Accuracy)) +
geom_line(color = "steelblue", size = 1.2) +
xlab("Valor de K") +
ylab("Exactitud") +
ggtitle("Rendimiento para diferentes valores de K") +
theme_bw() +
theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 12),
axis.title.y = element_text(face = "bold", size = 12),
axis.text = element_text(size = 10),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank())
best_k <- results$k[which.max(results$Accuracy)]
best_k
## [1] 43
CVErr_KNN <- 1 - max(sat_knn$results$Accuracy)
CVErr_KNN
## [1] 0.1400425
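The summary that follows corresponds to a multinomial logistic regression, but the training call itself is missing from the source. Judging from the printed `Call` (`nnet::multinom(formula = .outcome ~ ., data = dat, decay = param$decay)`), it was fitted through `caret::train`; a plausible reconstruction, with the CV settings assumed to match the other models, is:

```r
# Assumed reconstruction of the missing training call for sat_mlr:
# multinomial logistic regression via caret, tuned with 10-fold CV
set.seed(20)
sat_mlr <- train(C ~ ., data = sat_trn_df,
                 method = "multinom",
                 trControl = trainControl(method = "cv", number = 10),
                 trace = FALSE)
```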
summary(sat_mlr)
## Call:
## nnet::multinom(formula = .outcome ~ ., data = dat, decay = param$decay)
##
## Coefficients:
## (Intercept) B1 B2 B3 B4
## 2 -9.921154 0.5867093 -0.63317495 0.29943995 -0.1122932
## 3 -36.648978 0.7617752 -0.02673514 0.02795903 -0.2534971
## 4 -10.253548 0.5886442 -0.07999703 -0.02004337 -0.2854203
## 5 7.594031 0.4558313 -0.52173889 0.21832973 -0.2099831
## 6 4.599893 0.5678520 -0.18940106 -0.05394639 -0.2887664
##
## Std. Errors:
## (Intercept) B1 B2 B3 B4
## 2 1.6977543 0.05040409 0.04477873 0.05284647 0.04508065
## 3 1.2832527 0.03391253 0.03743092 0.04631104 0.04434272
## 4 0.8182467 0.03437754 0.03467115 0.04409015 0.04040293
## 5 0.9108139 0.03161938 0.03489501 0.04077884 0.03601165
## 6 0.7841282 0.03491517 0.03359844 0.04276092 0.03778148
##
## Residual Deviance: 3755.425
## AIC: 3805.425
# sat_mlr could also be fitted directly with nnet::multinom(C ~ ., data = sat_trn_df)
# Note: the errors below are resubstitution errors on the training data,
# not cross-validated estimates
confusion_matrix <- confusionMatrix(predict(sat_mlr, newdata = sat_trn_df), sat_trn_df$C)
CVErr_mlr2 <- 1 - unname(confusion_matrix$overall['Accuracy'])
CVErr_mlr2
## [1] 0.1569335
predicted_mlr <- predict(sat_mlr, newdata = sat_trn_df)
CVErr_mlr <- mean(predicted_mlr != sat_trn_df$C)
CVErr_mlr
## [1] 0.1569335
set.seed(20)
sat_LDA <- stepclass(C ~ B1 + B2 + B3 + B4, data = sat_trn_df, method = "lda",
                     criterion = "CR", direction = "both",
                     improvement = 0.01, fold = 10)
## `stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
## 4435 observations of 4 variables in 6 classes; direction: both
## stop criterion: improvement less than 1%.
## correctness rate: 0.56054; in: "B2"; variables (1): B2
## correctness rate: 0.79776; in: "B1"; variables (2): B2, B1
## correctness rate: 0.82616; in: "B3"; variables (3): B2, B1, B3
##
## hr.elapsed min.elapsed sec.elapsed
## 0.00 0.00 2.04
sat_LDA
## method : lda
## final model : C ~ B1 + B2 + B3
## <environment: 0x000001e68f9ce258>
##
## correctness rate = 0.8262
CVErr_LDA <- 1-unname(sat_LDA$result.pm["crossval.rate"])
set.seed(20)
sat_QDA <- stepclass(C ~ B1 + B2 + B3 + B4, data = sat_trn_df, method = "qda",
                     criterion = "CR", direction = "both",
                     improvement = 0.01, fold = 10)
## `stepwise classification', using 10-fold cross-validated correctness rate of method qda'.
## 4435 observations of 4 variables in 6 classes; direction: both
## stop criterion: improvement less than 1%.
## correctness rate: 0.58107; in: "B4"; variables (1): B4
## correctness rate: 0.80496; in: "B1"; variables (2): B4, B1
## correctness rate: 0.84849; in: "B2"; variables (3): B4, B1, B2
##
## hr.elapsed min.elapsed sec.elapsed
## 0.0 0.0 2.2
sat_QDA
## method : qda
## final model : C ~ B1 + B2 + B4
## <environment: 0x000001e68e56af70>
##
## correctness rate = 0.8485
CVErr_QDA <- 1-unname(sat_QDA$result.pm["crossval.rate"])
models <- c("rpart", "RF", "KNN", "RLM","LDA","QDA")
CVErrs <- c(CVErr_rpart,CVErr_rf,CVErr_KNN,CVErr_mlr,CVErr_LDA,CVErr_QDA)
CVErr_df <- data.frame(Label = models, Value = CVErrs)
ggplot(CVErr_df, aes(x = Label, y = Value)) +
geom_bar(stat = "identity", color = "black", fill = "red") + labs(x = "Modelos", y = "Error de Validación Cruzada K=10", title = "Comparación de Precisión de Modelos") +
theme(plot.title = element_text(hjust = 0.5))
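The bar chart below uses test-set classification errors (`CLErr_*`) whose computation does not appear in the source. A sketch of how they could be obtained, assuming the fitted models are still in scope and that the `stepclass` objects store the selected formula in `$formula`:

```r
library(MASS)   # lda, qda

# Misclassification rate on the test set
class_err <- function(pred, truth) mean(pred != truth)

CLErr_rpart <- class_err(predict(poda_rpart, sat_tst_df, type = "class"), sat_tst_df$C)
CLErr_rf    <- class_err(predict(sat_rf,  sat_tst_df), sat_tst_df$C)
CLErr_knn   <- class_err(predict(sat_knn, sat_tst_df), sat_tst_df$C)
CLErr_mlr   <- class_err(predict(sat_mlr, sat_tst_df), sat_tst_df$C)

# stepclass only selects variables; refit lda/qda with the selected formulas
lda_fit <- lda(sat_LDA$formula, data = sat_trn_df)
qda_fit <- qda(sat_QDA$formula, data = sat_trn_df)
CLErr_LDA <- class_err(predict(lda_fit, sat_tst_df)$class, sat_tst_df$C)
CLErr_QDA <- class_err(predict(qda_fit, sat_tst_df)$class, sat_tst_df$C)
```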
models <- c("rpart", "RF", "KNN", "RLM","LDA","QDA")
CVErrs <- c(CLErr_rpart,CLErr_rf,CLErr_knn,CLErr_mlr,CLErr_LDA,CLErr_QDA)
CVErr_df <- data.frame(Label = models, Value = CVErrs)
ggplot(CVErr_df, aes(x = Label, y = Value)) +
geom_bar(stat = "identity", color = "black", fill = "red") + labs(x = "Modelos", y = "Error de Clasificación", title = "Comparación de Precisión de Modelos") +
theme(plot.title = element_text(hjust = 0.5))
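The output that follows repeats the whole analysis using all 36 spectral attributes (`sat_trn_full_df`), but the corresponding code is missing from the source. A minimal sketch of how that data frame and the first tree could be rebuilt, with names inferred from the printed output:

```r
# Full training set: all 36 spectral values (V1..V36) plus the class label
sat_trn_full_df <- read.table("SAT_trn.txt", sep = "", header = FALSE)
names(sat_trn_full_df)[37] <- "C"
sat_trn_full_df$C[sat_trn_full_df$C == 7] <- 6   # same relabelling as before
sat_trn_full_df$C <- factor(sat_trn_full_df$C)

set.seed(20)
sat_rpart_full <- rpart(C ~ ., data = sat_trn_full_df,
                        method = "class", cp = -1, xval = 10)
printcp(sat_rpart_full)
```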
##
## Classification tree:
## rpart(formula = C ~ ., data = sat_trn_full_df, method = "class",
## cp = -1, xval = 10)
##
## Variables actually used in tree construction:
## [1] V1 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V2 V20 V21 V22 V23 V24 V25 V26
## [20] V27 V28 V29 V3 V30 V31 V32 V33 V34 V35 V36 V4 V5 V6 V7 V8 V9
##
## Root node error: 3363/4435 = 0.75829
##
## n= 4435
##
## CP nsplit rel error xerror xstd
## 1 0.26167113 0 1.00000 1.00000 0.0084779
## 2 0.25334523 1 0.73833 0.76896 0.0097636
## 3 0.12905144 2 0.48498 0.48498 0.0095487
## 4 0.06363366 3 0.35593 0.35742 0.0088020
## 5 0.01754386 4 0.29230 0.29497 0.0082518
## 6 0.01486768 5 0.27475 0.28338 0.0081337
## 7 0.01159679 6 0.25989 0.25453 0.0078153
## 8 0.00683913 7 0.24829 0.24591 0.0077128
## 9 0.00669045 9 0.23461 0.23877 0.0076253
## 10 0.00624442 13 0.20458 0.23699 0.0076030
## 11 0.00490633 15 0.19209 0.21588 0.0073269
## 12 0.00356824 17 0.18228 0.21053 0.0072531
## 13 0.00297354 18 0.17871 0.20607 0.0071903
## 14 0.00267618 19 0.17574 0.20517 0.0071776
## 15 0.00237883 21 0.17038 0.20369 0.0071563
## 16 0.00223015 22 0.16800 0.19923 0.0070916
## 17 0.00208147 24 0.16354 0.19923 0.0070916
## 18 0.00178412 26 0.15938 0.19744 0.0070654
## 19 0.00168500 27 0.15760 0.19685 0.0070567
## 20 0.00163544 30 0.15254 0.19685 0.0070567
## 21 0.00148677 32 0.14927 0.19506 0.0070302
## 22 0.00133809 36 0.14332 0.19387 0.0070124
## 23 0.00118941 40 0.13797 0.19447 0.0070213
## 24 0.00104074 50 0.12608 0.19387 0.0070124
## 25 0.00089206 55 0.12073 0.19120 0.0069721
## 26 0.00077312 57 0.11894 0.19060 0.0069631
## 27 0.00074338 62 0.11508 0.19090 0.0069676
## 28 0.00059471 64 0.11359 0.19090 0.0069676
## 29 0.00049559 66 0.11240 0.19179 0.0069811
## 30 0.00029735 69 0.11091 0.19269 0.0069946
## 31 0.00019824 74 0.10943 0.19804 0.0070742
## 32 0.00014868 77 0.10883 0.19893 0.0070873
## 33 0.00000000 83 0.10794 0.19982 0.0071003
## 34 -1.00000000 157 0.10794 0.19982 0.0071003
## 26
## 26
## [1] 0.0007731192
## [1] 0.1445321
## 1 2 3 4 5 6
## V1 14.347479 6.523180 12.785265 8.878950 9.215792 13.385304
## V2 25.080362 8.155885 10.329826 16.965498 13.801273 14.409123
## V3 7.750380 6.264034 10.674574 11.903256 8.868406 8.914603
## V4 11.671033 9.196722 16.190272 10.119210 9.599166 15.632320
## V5 20.327046 7.747617 14.732451 8.738162 12.260005 12.968500
## V6 22.024864 9.759857 8.967969 12.434673 15.591975 14.986900
## V7 8.193229 6.455881 11.596530 10.544200 7.867005 7.465667
## V8 16.440057 8.843152 10.938300 9.559178 9.546703 12.694114
## V9 27.245723 7.757990 11.797470 7.866552 9.557962 13.395249
## V10 19.668841 10.092615 7.702224 9.208108 14.255990 14.580319
## V11 8.183801 7.410958 12.179594 13.334420 5.674602 10.092250
## V12 13.869129 9.431361 12.276045 11.954095 10.082696 12.798514
## V13 20.251935 8.567868 16.168249 8.849309 11.890796 13.474792
## V14 22.215691 10.724722 10.223932 15.692282 15.305327 13.701134
## V15 7.862138 7.007398 9.906889 11.459271 7.941938 11.304583
## V16 16.505530 11.504687 13.949191 9.006578 11.792044 18.082491
## V17 25.253907 13.485706 19.712859 15.525689 20.078888 13.963230
## V18 21.396409 14.118719 14.385012 24.411052 24.468676 17.207413
## V19 8.560900 8.542848 10.437008 13.847183 10.239672 11.134022
## V20 19.027569 15.200911 11.888635 13.264401 11.666925 17.628338
## V21 29.431963 8.872517 17.675086 14.599193 15.461604 15.775755
## V22 20.051290 10.537004 11.771828 18.662235 19.376440 21.337874
## V23 9.749778 6.501513 12.096690 17.297252 8.046496 8.803671
## V24 15.582407 9.479060 14.049516 15.247126 9.821586 14.837700
## V25 23.697043 7.937527 16.986925 6.526053 10.340832 13.092715
## V26 23.867149 7.255891 9.029530 17.755496 10.805070 10.428707
## V27 7.894474 7.505076 12.865851 9.743226 8.409628 9.225300
## V28 14.727386 10.452824 13.915661 7.308338 9.939301 15.567830
## V29 22.313159 8.546988 15.018766 10.048277 10.401912 12.934963
## V30 17.037511 8.587731 8.721241 11.704029 13.527364 13.019518
## V31 8.545563 6.403730 8.403834 10.781731 9.028152 8.305877
## V32 16.176983 8.932304 8.691565 10.082833 11.848480 14.326251
## V33 24.911565 7.287095 15.507143 5.748705 10.478895 13.887847
## V34 18.162001 8.590430 9.937206 9.282485 16.150622 17.065096
## V35 8.769205 6.280038 12.924365 11.616752 8.099357 10.025106
## V36 16.279684 6.403133 13.129658 9.659609 11.145730 14.219610
## MeanDecreaseAccuracy MeanDecreaseGini
## V1 19.84657 62.23829
## V2 32.28670 89.32196
## V3 18.44815 39.12481
## V4 20.91121 66.59387
## V5 22.37433 123.11917
## V6 31.22470 109.57090
## V7 15.57568 35.83360
## V8 21.46521 83.61104
## V9 27.50258 106.41434
## V10 29.38192 78.43511
## V11 18.88206 36.86583
## V12 23.55204 62.45378
## V13 21.89113 143.09708
## V14 30.72190 129.29556
## V15 18.39168 52.08460
## V16 25.45299 145.49467
## V17 27.32588 242.93899
## V18 33.44038 224.31144
## V19 16.77841 75.62461
## V20 23.97588 179.15180
## V21 30.97529 193.19072
## V22 33.09108 174.07645
## V23 21.39281 51.19399
## V24 23.28907 108.81224
## V25 26.31587 105.59507
## V26 30.82425 68.20054
## V27 20.38909 39.62732
## V28 21.72995 93.16480
## V29 24.21030 128.52837
## V30 27.27160 92.26471
## V31 17.24529 34.76598
## V32 20.34748 99.84895
## V33 29.07778 101.76329
## V34 29.70615 105.09069
## V35 21.32641 39.08788
## V36 24.29074 63.03664
## [1] 0.08568207
## [1] 5
## [1] 0.0942617
## Call:
## nnet::multinom(formula = .outcome ~ ., data = dat, decay = param$decay)
##
## Coefficients:
## (Intercept) V1 V2 V3 V4 V5
## 2 -10.041277 0.011633320 -0.08866011 0.048736082 -0.007836211 0.20621349
## 3 -20.378721 0.011140595 -0.06815353 0.027252249 0.039038401 0.12890553
## 4 -10.005767 0.037706249 -0.02832780 0.005182813 0.030101942 0.08771919
## 5 4.310293 0.015662639 -0.05979296 0.025596502 0.028523005 0.13158697
## 6 2.981347 0.004889374 -0.06399075 -0.020918896 0.065396066 0.15362516
## V6 V7 V8 V9 V10 V11
## 2 -0.07046031 0.05849743 -0.09916763 -0.23035971 0.05487740 -0.01018132
## 3 -0.01023674 0.02334025 -0.10407334 0.01094643 -0.07388672 0.10356648
## 4 -0.04171813 0.07508490 -0.08062009 -0.03705449 -0.05554305 0.05291187
## 5 -0.03302893 0.02956949 -0.14203745 -0.04364464 -0.07723677 0.10014638
## 6 0.01751298 0.01713530 -0.13431599 -0.03026042 -0.04818043 0.04464404
## V12 V13 V14 V15 V16 V17
## 2 0.047024992 0.11602469 -0.06530551 0.012422682 0.09757902 0.196116606
## 3 -0.053235834 0.12028057 -0.05625099 -0.013169008 0.06185362 0.021216840
## 4 -0.029957741 0.07085136 -0.05742333 0.002926769 0.03030945 0.042819710
## 5 -0.013539372 0.03250508 -0.04543212 0.028452811 0.05963729 -0.009482224
## 6 0.003462489 0.03033801 -0.04426977 0.005334445 0.04055678 -0.009438855
## V18 V19 V20 V21 V22 V23
## 2 -0.170050812 0.08470339 -0.2117381 -0.025080470 0.09688088 -0.028190776
## 3 -0.024565151 0.09392422 -0.1519914 0.043435687 0.02603040 -0.035636777
## 4 -0.052699586 0.06156555 -0.1323629 0.094373655 0.03255762 -0.007513793
## 5 -0.088499090 0.10572731 -0.2332036 0.086437722 0.01248936 -0.032978130
## 6 -0.002384114 0.07188680 -0.1813734 0.009126433 0.02387681 -0.025139193
## V24 V25 V26 V27 V28 V29
## 2 0.14146854 -0.104255928 0.01286808 0.1156963 -0.07636536 0.081602559
## 3 0.11258387 0.005678118 -0.06977875 0.1568363 -0.09402118 0.053380932
## 4 0.03780018 -0.085349494 -0.02746782 0.1701236 -0.13727885 0.035347957
## 5 0.08979900 -0.044821516 -0.03564611 0.1139013 -0.11675376 0.007046051
## 6 0.07970401 0.010761963 -0.04017290 0.0994432 -0.12406979 0.055388797
## V30 V31 V32 V33 V34 V35
## 2 -0.111281819 0.04687024 -0.035408254 0.11878873 -0.07399343 0.01072422
## 3 -0.037891231 0.01333758 -0.043551988 0.08918813 -0.02290963 -0.01924823
## 4 -0.061578044 0.09632598 -0.073387776 0.12374137 -0.08545925 0.02117023
## 5 -0.006815922 0.03950480 -0.006685306 0.09955379 -0.09604781 0.02921400
## 6 -0.067981417 0.03953664 -0.012509769 0.07101754 -0.01432718 0.01335151
## V36
## 2 -0.0410039083
## 3 -0.0007439179
## 4 -0.0184314429
## 5 -0.0439894601
## 6 -0.0664708960
##
## Std. Errors:
## (Intercept) V1 V2 V3 V4 V5 V6
## 2 0.01236324 0.04366473 0.03165150 0.03499723 0.03335968 0.05167846 0.03677143
## 3 0.34938369 0.03234255 0.02519188 0.02874874 0.02940229 0.03930459 0.02912072
## 4 0.47599482 0.03283715 0.02525149 0.02873599 0.02899265 0.04009069 0.02926500
## 5 0.21208527 0.03591308 0.02666621 0.02943234 0.02900005 0.04281700 0.03056582
## 6 0.46297310 0.03106560 0.02392134 0.02651756 0.02719780 0.03737165 0.02767196
## V7 V8 V9 V10 V11 V12 V13
## 2 0.03733509 0.03632063 0.04617490 0.03001791 0.03639816 0.03208830 0.04764122
## 3 0.02961524 0.03253631 0.03356620 0.02303724 0.02866407 0.02740302 0.03573847
## 4 0.02978146 0.03212779 0.03435290 0.02316587 0.02881193 0.02726241 0.03650879
## 5 0.03103002 0.03225148 0.03707527 0.02425580 0.02974641 0.02713716 0.03961988
## 6 0.02748290 0.03039313 0.03195555 0.02175804 0.02653014 0.02534765 0.03453444
## V14 V15 V16 V17 V18 V19 V20
## 2 0.03403094 0.03590879 0.03675849 0.05359208 0.03706961 0.03792888 0.03849453
## 3 0.02676861 0.02912685 0.03248246 0.04044278 0.02918168 0.02968905 0.03469929
## 4 0.02668878 0.02931989 0.03221597 0.04089251 0.02916575 0.03003330 0.03408425
## 5 0.02835822 0.03012988 0.03222492 0.04407052 0.03128037 0.03112775 0.03408286
## 6 0.02570572 0.02700195 0.03036498 0.03850554 0.02797236 0.02750367 0.03269578
## V21 V22 V23 V24 V25 V26 V27
## 2 0.04834975 0.03228657 0.03700916 0.03470027 0.04309672 0.02927003 0.03513068
## 3 0.03655183 0.02639977 0.02891477 0.03103616 0.03285555 0.02365707 0.02859243
## 4 0.03738730 0.02649046 0.02922960 0.03027532 0.03350740 0.02355047 0.02886155
## 5 0.04043290 0.02791366 0.03021617 0.03024462 0.03589307 0.02509726 0.02983479
## 6 0.03477958 0.02531165 0.02676384 0.02889859 0.03143473 0.02279398 0.02667284
## V28 V29 V30 V31 V32 V33 V34
## 2 0.03391676 0.05116816 0.03648290 0.03758016 0.03769108 0.04256800 0.02926133
## 3 0.02951688 0.03896867 0.02922321 0.03015095 0.03424834 0.03124130 0.02349621
## 4 0.02929196 0.03953727 0.02916499 0.03031922 0.03362662 0.03277022 0.02353478
## 5 0.02966845 0.04199949 0.03090811 0.03129439 0.03337880 0.03614112 0.02492615
## 6 0.02750901 0.03670997 0.02787971 0.02790729 0.03212398 0.03059734 0.02249722
## V35 V36
## 2 0.03600778 0.03289523
## 3 0.02876995 0.02939106
## 4 0.02880899 0.02859065
## 5 0.02978972 0.02839475
## 6 0.02666640 0.02697115
##
## Residual Deviance: 4740.439
## AIC: 5110.439
## [1] 0.172717
## correctness rate: 0.56054; in: "V18"; variables (1): V18
## correctness rate: 0.79776; in: "V17"; variables (2): V18, V17
## correctness rate: 0.82797; in: "V15"; variables (3): V18, V17, V15
##
## hr.elapsed min.elapsed sec.elapsed
## 0.00 0.00 23.76
## method : lda
## final model : C ~ V15 + V17 + V18
## <environment: 0x000001e68e554ac0>
##
## correctness rate = 0.828
## correctness rate: 0.58107; in: "V20"; variables (1): V20
## correctness rate: 0.80496; in: "V17"; variables (2): V20, V17
## correctness rate: 0.84849; in: "V18"; variables (3): V20, V17, V18
##
## hr.elapsed min.elapsed sec.elapsed
## 0.00 0.00 24.41
## method : qda
## final model : C ~ V17 + V18 + V20
## <environment: 0x000001e6812031f0>
##
## correctness rate = 0.8485
models <- c("rpart", "RF", "KNN", "RLM","LDA","QDA")
CVErrs <- c(CLErr_rpart,CLErr_rf,CLErr_knn,CLErr_mlr,CLErr_LDA,CLErr_QDA)
CVErr_df <- data.frame(Label = models, Value = CVErrs)
ggplot(CVErr_df, aes(x = Label, y = Value)) +
geom_bar(stat = "identity", color = "black", fill = "red") + labs(x = "Modelos", y = "Error de Clasificación", title = "Comparación de Precisión de Modelos") +
theme(plot.title = element_text(hjust = 0.5))